Abstract

Nonnegative matrix factorization (NMF) factorizes a nonnegative matrix into the product of two
nonnegative matrices, namely a signal matrix and a mixing matrix. NMF suffers from scale
and ordering ambiguities. Often, the source signals are monotonous in nature. For example, in a
source separation problem, the source signals can be monotonously increasing or decreasing while
the mixing matrix has nonnegative entries. NMF methods may not be effective in such
cases because they suffer from the ordering ambiguity. This paper proposes an approach that incorporates
the notion of monotonicity in NMF, labeled monotonous NMF. An algorithm based on alternating
least squares is proposed for recovering monotonous signals from a data matrix. Further, the
assumption on the mixing matrix is relaxed to extend monotonous NMF to data matrices with real
entries. The approach is illustrated using synthetic noisy data. The results obtained by
monotonous NMF are compared with standard NMF algorithms from the literature, and it is shown
that monotonous NMF estimates the source signals well in comparison to standard NMF algorithms
when the underlying source signals are monotonous.

Keywords: Nonnegative matrix factorization, Monotonicity, Unsupervised learning, Blind source separation


1 Introduction 


Nonnegative matrix factorization (NMF) is one of the most widely used matrix factorization techniques,
with application areas ranging from basic sciences such as chemistry, environmental science, and systems
biology to image and video processing, blind source separation, text mining, and social network analysis.
The reasons for the wide applicability of NMF are twofold: (i) non-negativity is
prevalent in real-world data, and (ii) the latent factors estimated by NMF are easily interpretable.
Further, the seminal paper by Lee and Seung has helped popularize NMF in various fields.

In unsupervised learning methods, the objective is to extract signals or features from the given data.
For example, in blind source separation (BSS), the objective is to identify the underlying source signals
from noisy data. NMF has been routinely applied to separate source signals from noisy data.
NMF decomposes a data matrix into a mixing matrix (coefficients of signals) and a source signal
matrix. Since NMF suffers from ordering and scaling ambiguities, the NMF factorization is not unique.
To obtain an appropriate solution, constraints such as sparsity have been incorporated in NMF.
Further, semi-NMF imposes non-negativity constraints only on the source signals and allows
mixed-sign entries in the data and mixing matrices [4].

Many algorithms have been proposed to improve the speed and convergence of NMF. One of the most
commonly used families of algorithms is the multiplicative update rules and their variants. Recently,
several algorithms in the alternating least-squares framework combined with numerical optimization
techniques, such as the active-set method [8], projected gradient methods, and quadratic programming [16], have
been proposed to improve speed and convergence. Further, necessary and sufficient
conditions for a unique NMF decomposition have also been investigated. Although there has been
considerable work on improving algorithms to obtain better solutions, to the best of our knowledge, there has been
no attempt to incorporate monotonicity of the signals in NMF.

Source signals in chemistry and systems biology often exhibit monotonous behaviour. In such scenarios,
the formulation of monotonicity constraints is important for recovering source signals using NMF or
semi-NMF. This paper investigates how to incorporate monotonicity in NMF. We
propose a new approach, called monotonous NMF, for recovering monotonous source signals from noisy
data. Further, we extend this approach to semi-NMF. Two algorithms for monotonous (semi-)NMF
are also proposed. The approach is demonstrated on synthetic data. Moreover, future work on
incorporating sparseness and integer entries in the mixing matrix will be discussed.

The rest of this paper is organized as follows. In Section 2, matrix factorization and NMF-related
background are introduced. Section 3 proposes extensions of NMF that account for the monotonicity of the source
signals. Section 4 illustrates monotonous NMF through simulation studies based on synthetic
data sets and compares the performance of the proposed monotonous NMF with two NMF
algorithms from the existing literature. Section 5 concludes the paper. Next, the notation used in the
paper is described.

1.1 Notations


Bold capital and small letters denote matrices and vectors; small italic letters
denote scalar quantities; $\mathbf{w}_{\cdot j}$ and $\mathbf{h}_{j \cdot}$ denote column and row vectors, respectively;
$\mathbf{Q}(i,j)$ denotes the $(i,j)$th element of matrix $\mathbf{Q}$; $p(\cdot)$ indicates permutations; $\mathrm{vec}(\mathbf{Q})$ indicates the
vectorization of a matrix $\mathbf{Q}$, which converts the matrix into a column vector; $(\cdot)^{+}$ indicates the pseudo-inverse of
a matrix; $\mathbf{Q}^{T}$ is the transpose of matrix $\mathbf{Q}$; $\mathbb{R}^{m \times n}$ indicates $m \times n$-dimensional matrices; $\|\cdot\|$ denotes the
Frobenius norm; $\|\cdot\|_{1}$ denotes the $L_{1}$-norm of a vector; $\mathbb{Z}$ is the set of integers; $\mathbb{R}_{+}$ indicates nonnegative real numbers;
$\mathbb{S}_{+}^{n}$ denotes $n \times n$-dimensional symmetric positive semidefinite matrices; $\otimes$ denotes the Kronecker product;
$\mathbf{I}_{n}$ is an $n$-dimensional identity matrix; $\mathbf{0}_{n}$ denotes a vector of length $n$ with zeros as elements; $\geq$ or $\leq$
indicates component-wise inequality for matrices and vectors.


2 Matrix Factorization 

A data matrix $\mathbf{Z} \in \mathbb{R}^{n \times m}$ can be factorized as a product of $\mathbf{W} \in \mathbb{R}^{n \times s}$ and $\mathbf{H} \in \mathbb{R}^{s \times m}$ for $s \ll \min(m, n)$
as follows:

$$\mathbf{Z} \approx \mathbf{W}\mathbf{H}. \tag{1}$$

In BSS problems, $\mathbf{W}$ is the unknown mixing matrix and $\mathbf{H}$ is the source signal matrix containing the unknown
signals. Then, $m$ is the number of observations, $n$ the number of samples, and $s$ the number of
sources. Note that Eq. (1) considers an approximate factorization of the data matrix. Since measurements of physical
systems are often corrupted by noise, an exact factorization of $\mathbf{Z}$ into $\mathbf{W}$ and $\mathbf{H}$ should
not be attempted. In the presence of noise, the data matrix can be written exactly as

$$\mathbf{Z} = \mathbf{W}\mathbf{H} + \mathbf{E}, \tag{2}$$

where $\mathbf{E}$ is an $n \times m$-dimensional matrix containing the contribution of noise to the given data.

If $\mathbf{W}$ is known and $\mathrm{rank}(\mathbf{W}) = s$, then the least-squares solution for the unknown signal matrix is given
by $\mathbf{H} = \mathbf{W}^{+}\mathbf{Z}$. However, $\mathbf{W}$ is often not known, and the objective of BSS problems is to estimate
the unknown mixing matrix and signal matrix from the given data matrix. Note that BSS problems
can also be viewed as unsupervised learning methods where the objective is to extract useful features
from the given data. To accomplish this task, prior knowledge about the underlying system, such
as non-negativity and sparsity, is imposed as constraints. These constraints lead to factorizations of
the data matrix with meaningful physical insights. The non-negativity constraint leads to the popular
non-negative matrix factorization (NMF). In the next section, NMF is described briefly.
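As a minimal sketch of the known-$\mathbf{W}$ case described above, the following NumPy snippet recovers $\mathbf{H}$ from noise-free data through the pseudo-inverse. The dimensions, the random seed, and all variable names are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, s, m = 8, 3, 50                 # assumed sizes, for illustration only

W = rng.random((n, s))             # known mixing matrix
H = rng.random((s, m))             # source signals to be recovered
Z = W @ H                          # noise-free data, Eq. (1) with equality

# Least-squares estimate H = W^+ Z (valid when rank(W) = s)
H_est = np.linalg.pinv(W) @ Z

rank_ok = np.linalg.matrix_rank(W) == s
recovered = np.allclose(H_est, H)
```

With noise added to `Z`, `H_est` would only approximate `H`, and nothing in this estimate enforces nonnegativity or monotonicity.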
2.1 Nonnegative matrix factorization (NMF)


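NMF aims to factorize $\mathbf{Z}$ into two nonnegative matrices $\mathbf{W} \in \mathbb{R}_{+}^{n \times s}$ and $\mathbf{H} \in \mathbb{R}_{+}^{s \times m}$. In NMF, the
following problem is solved to obtain $\mathbf{W}$ and $\mathbf{H}$:

$$\min_{\mathbf{W},\,\mathbf{H}} \ \|\mathbf{Z} - \mathbf{W}\mathbf{H}\| \quad \text{s.t.} \quad \mathbf{W} \geq 0,\ \mathbf{H} \geq 0. \tag{3}$$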


Imposing the nonnegativity constraints means that the data are represented only by additive combinations of several rank-one
factors. Several algorithms have been proposed in the literature to solve Eq. (3)
efficiently. However, the NMF solution suffers from the following ambiguities:


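• Scale ambiguity: Consider a non-singular matrix $\mathbf{T} \in \mathbb{R}_{+}^{s \times s}$. Then, the factorization of $\mathbf{Z}$ can
also be represented as:

$$\mathbf{Z} \approx \mathbf{W}\mathbf{H} = (\mathbf{W}\mathbf{T})(\mathbf{T}^{-1}\mathbf{H}) = \mathbf{W}_{T}\mathbf{H}_{T}. \tag{4}$$

Eq. (4) indicates that $\mathbf{W}_{T}$ and $\mathbf{H}_{T}$ are also a solution of problem (3) whenever both remain nonnegative (for example, when $\mathbf{T}$ is a positive diagonal matrix). Hence, NMF leads to a
local optimum that depends on the initialization, and there are numerous solutions
to problem (3).

• Ordering: Note that $\mathbf{Z} \approx \sum_{i=1}^{s} \mathbf{w}_{\cdot i}\,\mathbf{h}_{i \cdot} = \sum_{i=1}^{s} \mathbf{w}_{\cdot p(i)}\,\mathbf{h}_{p(i) \cdot}$, where $p(\cdot)$ is an ordering (or permutation).

Hence, it is not possible to recover the order of the columns of $\mathbf{W}$ and of the rows of $\mathbf{H}$.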
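Both ambiguities listed above can be checked numerically. The sketch below uses small random factors, a positive diagonal $\mathbf{T}$, and an arbitrary permutation; all of these are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.random((4, 3))
H = rng.random((3, 6))

# Scale ambiguity, Eq. (4): a positive diagonal T keeps both factors nonnegative.
T = np.diag([2.0, 0.5, 10.0])
W_T, H_T = W @ T, np.linalg.inv(T) @ H

# Ordering ambiguity: permute columns of W and rows of H consistently.
p = np.array([2, 0, 1])
W_p, H_p = W[:, p], H[p, :]

same_scaled = np.allclose(W @ H, W_T @ H_T)
same_permuted = np.allclose(W @ H, W_p @ H_p)
still_nonnegative = bool((W_T >= 0).all() and (H_T >= 0).all())
```

All three pairs of factors reproduce the same product, so the data alone cannot distinguish between them.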


Therefore, the formulation of NMF in Eq. (3) has many solutions. Additional constraints such as
sparsity, or partial or full knowledge of a noisy $\mathbf{W}$, need to be imposed to reduce the number of solutions or
to obtain a unique solution of problem (3). These constraints depend on the application and
the underlying physical system. Often, the signal sources are in increasing or decreasing order, i.e.,
monotonous in nature. In such cases, each row of $\mathbf{H}$ contains ordered elements. Hence, the monotonicity
constraint can be added to the optimization problem (3) for recovering these signals from the data
matrix. The main objective of this paper is to develop an approach based on NMF for recovering
monotonous signal sources. This leads to a novel method, called monotonous NMF, which is described
next.


3 Monotonous NMF 


In this section, the monotonicity constraints on the source signals are formulated and imposed on
the factorization problem (3). Consider the $i$th row of the $\mathbf{H}$ matrix, $\mathbf{h}_{i \cdot}$ with $i = 1, \ldots, s$, whose elements
are in monotonously increasing or decreasing order. Without loss of generality, we assume that out of
the $s$ rows, the first $c$ rows have entries that are monotonously increasing, while the remaining rows
have entries that are monotonously decreasing. Then, the following relationships hold between the
elements:

$$\begin{aligned}
h_{k,1} \leq h_{k,2} \leq \cdots \leq h_{k,m}, &\quad k = 1, \ldots, c, &&\text{(increasing order)} \\
h_{l,1} \geq h_{l,2} \geq \cdots \geq h_{l,m}, &\quad l = c+1, \ldots, s, &&\text{(decreasing order)}.
\end{aligned} \tag{5}$$

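The optimization problem in Eq. (3) can be reformulated by imposing the constraints in Eq. (5) as:

$$\begin{aligned}
\min_{\mathbf{W},\,\mathbf{H}} \quad & \|\mathbf{Z} - \mathbf{W}\mathbf{H}\| \\
\text{s.t.} \quad & \mathbf{W} \geq 0,\ \mathbf{H} \geq 0, \\
& h_{k,1} \leq h_{k,2} \leq \cdots \leq h_{k,m}, \quad k = 1, \ldots, c, \\
& h_{l,1} \geq h_{l,2} \geq \cdots \geq h_{l,m}, \quad l = c+1, \ldots, s.
\end{aligned} \tag{6}$$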

The formulation in Eq. (6) is a nonconvex optimization problem. A solution to the optimization
problem (6) can be obtained by decomposing it into two subproblems, which are solved
alternately until the solution converges. This approach is known as alternating least squares (ALS) in
the literature. The two subproblems can be formulated as follows:


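• Given $\mathbf{H}$:

$$\min_{\mathbf{W}} \ \|\mathbf{Z} - \mathbf{W}\mathbf{H}\| \quad \text{s.t.} \quad \mathbf{W} \geq 0 \tag{7}$$

• Given $\mathbf{W}$:

$$\begin{aligned}
\min_{\mathbf{H}} \quad & \|\mathbf{Z} - \mathbf{W}\mathbf{H}\| \\
\text{s.t.} \quad & \mathbf{H} \geq 0, \\
& h_{k,1} \leq h_{k,2} \leq \cdots \leq h_{k,m}, \quad k = 1, \ldots, c, \\
& h_{l,1} \geq h_{l,2} \geq \cdots \geq h_{l,m}, \quad l = c+1, \ldots, s.
\end{aligned} \tag{8}$$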


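Note that minimizing the objective functions in Eqs. (7) and (8) is equivalent to minimizing their squares, which are quadratic functions. If
these subproblems can be formulated in terms of the standard quadratic programming form

$$\begin{aligned}
\min_{\mathbf{x}} \quad & \tfrac{1}{2}\,\mathbf{x}^{T}\mathbf{P}\mathbf{x} + \mathbf{f}^{T}\mathbf{x} \\
\text{s.t.} \quad & \mathbf{G}\mathbf{x} \leq \mathbf{l}, \\
& \mathbf{K}\mathbf{x} = \mathbf{r},
\end{aligned} \tag{9}$$

where $\mathbf{x} \in \mathbb{R}^{p}$, $\mathbf{P} \in \mathbb{S}_{+}^{p}$, $\mathbf{G} \in \mathbb{R}^{t \times p}$, $\mathbf{l} \in \mathbb{R}^{t}$, $\mathbf{K} \in \mathbb{R}^{k \times p}$, and $\mathbf{r} \in \mathbb{R}^{k}$, then both
subproblems are convex optimization problems [2]. The standard ALS framework can be applied to
solve problems (7)-(8). We reformulate problems (7)-(8) in terms of the standard form (9) as:

• Given $\mathbf{H}$:

$$\begin{aligned}
\min_{\mathbf{W}} \quad & \mathrm{vec}(\mathbf{W})^{T}(\mathbf{H}\mathbf{H}^{T} \otimes \mathbf{I}_{n})\,\mathrm{vec}(\mathbf{W}) - 2\,\mathrm{vec}(\mathbf{Z}\mathbf{H}^{T})^{T}\mathrm{vec}(\mathbf{W}) \\
\text{s.t.} \quad & -\mathbf{I}_{ns}\,\mathrm{vec}(\mathbf{W}) \leq \mathbf{0}_{ns}
\end{aligned} \tag{10}$$

• Given $\mathbf{W}$:

$$\begin{aligned}
\min_{\mathbf{H}} \quad & \mathrm{vec}(\mathbf{H})^{T}(\mathbf{I}_{m} \otimes \mathbf{W}^{T}\mathbf{W})\,\mathrm{vec}(\mathbf{H}) - 2\,\mathrm{vec}(\mathbf{W}^{T}\mathbf{Z})^{T}\mathrm{vec}(\mathbf{H}) \\
\text{s.t.} \quad & \begin{bmatrix} -\mathbf{I}_{ms} \\ \mathbf{A} \end{bmatrix} \mathrm{vec}(\mathbf{H}) \leq \mathbf{0}_{(2m-1)s},
\end{aligned} \tag{11}$$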


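where $\mathbf{A} \in \mathbb{R}^{(m-1)s \times ms}$ is a matrix having the block-diagonal form

$$\mathbf{A} = \begin{bmatrix}
\mathbf{D}_{1} & & & \\
& \mathbf{D}_{2} & & \\
& & \ddots & \\
& & & \mathbf{D}_{s}
\end{bmatrix}.$$

The entries not shown in $\mathbf{A}$ are zeros. The matrices $\mathbf{D}_{k} \in \mathbb{R}^{(m-1) \times m}$, $k = 1, 2, \ldots, c$, and $\mathbf{D}_{l} \in \mathbb{R}^{(m-1) \times m}$, $l = c+1, \ldots, s$, are defined as follows:

$$\mathbf{D}_{k} = \begin{bmatrix}
1 & -1 & & & \\
& 1 & -1 & & \\
& & \ddots & \ddots & \\
& & & 1 & -1
\end{bmatrix},
\qquad
\mathbf{D}_{l} = \begin{bmatrix}
-1 & 1 & & & \\
& -1 & 1 & & \\
& & \ddots & \ddots & \\
& & & -1 & 1
\end{bmatrix}.$$

Note that $\mathbf{D}_{k}$ and $\mathbf{D}_{l}$ are Toeplitz matrices with the first rows being $[1\ \ {-1}\ \ 0\ \ldots\ 0]$ and $[-1\ \ 1\ \ 0\ \ldots\ 0]$,
respectively.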
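The difference matrices and the block-diagonal $\mathbf{A}$ can be assembled as in the sketch below. The helper names `difference_matrix` and `constraint_matrix` are hypothetical. Because each block multiplies the $m$ entries of a single row of $\mathbf{H}$, the sketch applies $\mathbf{A}$ to the row-stacked vector (that is, $\mathrm{vec}(\mathbf{H}^{T})$, NumPy's default flattening order); with a column-stacked $\mathrm{vec}(\mathbf{H})$ the columns of $\mathbf{A}$ would have to be permuted accordingly. This ordering is an assumption of the sketch.

```python
import numpy as np

def difference_matrix(m, increasing=True):
    """(m-1) x m Toeplitz matrix with first row [1, -1, 0, ...] or [-1, 1, 0, ...]."""
    D = np.eye(m - 1, m) - np.eye(m - 1, m, k=1)
    return D if increasing else -D

def constraint_matrix(s, m, c):
    """Block-diagonal A: blocks 0..c-1 encode increasing rows, c..s-1 decreasing rows."""
    A = np.zeros(((m - 1) * s, m * s))
    for i in range(s):
        rows = slice(i * (m - 1), (i + 1) * (m - 1))
        cols = slice(i * m, (i + 1) * m)
        A[rows, cols] = difference_matrix(m, increasing=(i < c))
    return A

# Two increasing rows and one decreasing row (s = 3, c = 2, m = 4).
H = np.array([[0.0, 1.0, 2.0, 3.0],
              [1.0, 1.0, 4.0, 9.0],
              [5.0, 3.0, 3.0, 0.5]])
A = constraint_matrix(s=3, m=4, c=2)

feasible = bool(np.all(A @ H.reshape(-1) <= 0))          # A h <= 0 holds

H_bad = H.copy()
H_bad[0, 2] = 0.5                                        # breaks the order of row 0
violated = bool(np.any(A @ H_bad.reshape(-1) > 0))       # at least one row of A h > 0
```

A QP solver would receive `A` stacked below the negated identity, as in the constraint of (11).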

The matrices $(\mathbf{H}\mathbf{H}^{T} \otimes \mathbf{I}_{n})$ and $(\mathbf{I}_{m} \otimes \mathbf{W}^{T}\mathbf{W})$ in Eqs. (10) and (11) are
positive semidefinite because $\mathbf{H}\mathbf{H}^{T}$ and $\mathbf{W}^{T}\mathbf{W}$ are Gram matrices; they are positive definite when $\mathbf{H}$ and $\mathbf{W}$ have full rank $s$. Further, both subproblems (10)-(11) are subject to linear
inequality constraints. Hence, the problems in (10)-(11) are quadratic programs (QP)
of the form (9). Consequently, they are convex optimization problems [2], and optimal solutions of the
problems in Eqs. (10)-(11) exist. Standard QP algorithms such as active-set methods and interior-point
methods can be used to solve the optimization problems (10)-(11) in the ALS framework.
Implementations of these algorithms are available in commercial scientific packages.
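For reference, the quadratic forms in (10) and (11) follow from expanding the squared Frobenius norm and applying the identity $\mathrm{vec}(\mathbf{A}\mathbf{X}\mathbf{B}) = (\mathbf{B}^{T} \otimes \mathbf{A})\,\mathrm{vec}(\mathbf{X})$. The derivation below is a sketch that assumes $\mathrm{vec}(\cdot)$ stacks columns.

```latex
\begin{aligned}
\|\mathbf{Z}-\mathbf{W}\mathbf{H}\|^{2}
 &= \operatorname{tr}(\mathbf{Z}^{T}\mathbf{Z})
  - 2\operatorname{tr}(\mathbf{Z}^{T}\mathbf{W}\mathbf{H})
  + \operatorname{tr}(\mathbf{H}^{T}\mathbf{W}^{T}\mathbf{W}\mathbf{H}),\\
\operatorname{vec}(\mathbf{W}\mathbf{H})
 &= (\mathbf{H}^{T}\otimes\mathbf{I}_{n})\operatorname{vec}(\mathbf{W})
  = (\mathbf{I}_{m}\otimes\mathbf{W})\operatorname{vec}(\mathbf{H}),\\
\operatorname{tr}(\mathbf{H}^{T}\mathbf{W}^{T}\mathbf{W}\mathbf{H})
 &= \operatorname{vec}(\mathbf{W})^{T}(\mathbf{H}\mathbf{H}^{T}\otimes\mathbf{I}_{n})\operatorname{vec}(\mathbf{W})
  = \operatorname{vec}(\mathbf{H})^{T}(\mathbf{I}_{m}\otimes\mathbf{W}^{T}\mathbf{W})\operatorname{vec}(\mathbf{H}),\\
\operatorname{tr}(\mathbf{Z}^{T}\mathbf{W}\mathbf{H})
 &= \operatorname{vec}(\mathbf{Z}\mathbf{H}^{T})^{T}\operatorname{vec}(\mathbf{W})
  = \operatorname{vec}(\mathbf{W}^{T}\mathbf{Z})^{T}\operatorname{vec}(\mathbf{H}).
\end{aligned}
```

The term $\operatorname{tr}(\mathbf{Z}^{T}\mathbf{Z})$ does not depend on $\mathbf{W}$ or $\mathbf{H}$ and can be dropped from the objective; the remaining terms equal twice the standard form (9) with $\mathbf{f}$ equal to the negated linear coefficient.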

Next, we present an algorithm to solve the optimization problems (10)-(11) for estimating $\mathbf{W}$ and
$\mathbf{H}$ given $\mathbf{Z}$. Further, one can impose upper bounds on the elements of $\mathbf{W}$ and $\mathbf{H}$ based on the
data matrix and the underlying physical problem. Algorithm 1 presents an implementation for solving
the monotonous NMF problem (6) in the ALS framework.


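Data: $\mathbf{Z}$, number of source signals $s$ ($s \ll \min(m, n)$)

Result: $\mathbf{W}$, $\mathbf{H}$

initialization: $\mathbf{W}_{old}$, $\mathbf{H}_{old}$;

while $\|\mathbf{W}_{old} - \mathbf{W}_{new}\| >$ tolerance and $\|\mathbf{H}_{old} - \mathbf{H}_{new}\| >$ tolerance do
    read current;
    $\mathbf{W}_{new} \Leftarrow$ Solution of (10)
    $\mathbf{H}_{new} \Leftarrow$ Solution of (11)
end

Return $\mathbf{W} = \mathbf{W}_{new}$, $\mathbf{H} = \mathbf{H}_{new}$

Algorithm 1: Algorithm for Monotonous NMF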


In Algorithm 1, $\mathbf{H}$ can be normalized after each iteration. Then, $\mathbf{W}$ has to be scaled in an appropriate
manner before proceeding to the next iteration. Further, note that subproblem (10) is a standard
subproblem in NMF algorithms based on the ALS framework. Hence, the multiplicative rules proposed
in the literature can also be used to update $\mathbf{W}$ instead of solving the optimization problem (10) in the ALS framework.
It has been shown that algorithms in the ALS framework (such as Algorithm 1) decrease the
function value at each iteration because the ALS framework produces a stationary point at each iteration.
Hence, Algorithm 1 converges to a solution, but note that the convergence speed may be slow.
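The alternating structure of Algorithm 1 can be sketched in Python as follows. This is not the implementation used in the paper: instead of calling a QP solver for (10) and (11), each subproblem is approximated by a few projected-gradient steps, and the projection onto nonnegative monotone rows uses the pool-adjacent-violators algorithm followed by clipping at zero. The function names, the iteration counts, and the test signals are all assumptions made for illustration.

```python
import numpy as np

def pava(y):
    """Least-squares projection of y onto nondecreasing sequences."""
    means, sizes = [], []
    for v in y:
        means.append(float(v))
        sizes.append(1)
        # Pool adjacent blocks while they violate the ordering.
        while len(means) > 1 and means[-2] > means[-1]:
            total = sizes[-2] + sizes[-1]
            pooled = (means[-2] * sizes[-2] + means[-1] * sizes[-1]) / total
            means[-2:] = [pooled]
            sizes[-2:] = [total]
    return np.repeat(means, sizes)

def project_row(y, increasing):
    """Project y onto nonnegative sequences with the requested ordering."""
    fit = pava(y) if increasing else pava(y[::-1])[::-1]
    return np.maximum(fit, 0.0)

def monotonous_nmf(Z, s, c, outer=100, inner=20, tol=1e-8, seed=0):
    """Alternating updates: rows 0..c-1 of H increasing, rows c..s-1 decreasing."""
    rng = np.random.default_rng(seed)
    n, m = Z.shape
    W = rng.random((n, s))
    H = np.vstack([project_row(r, k < c) for k, r in enumerate(rng.random((s, m)))])
    for _ in range(outer):
        W_old, H_old = W.copy(), H.copy()
        # W-step (problem (7)): projected gradient onto W >= 0.
        step = 1.0 / (np.linalg.norm(H @ H.T, 2) + 1e-12)
        for _ in range(inner):
            W = np.maximum(W - step * (W @ H - Z) @ H.T, 0.0)
        # H-step (problem (8)): projected gradient onto nonnegative monotone rows.
        step = 1.0 / (np.linalg.norm(W.T @ W, 2) + 1e-12)
        for _ in range(inner):
            G = H - step * W.T @ (W @ H - Z)
            H = np.vstack([project_row(G[k], k < c) for k in range(s)])
        # Stopping rule of Algorithm 1: continue only while both changes exceed tol.
        if np.linalg.norm(W - W_old) <= tol or np.linalg.norm(H - H_old) <= tol:
            break
    return W, H

# Usage on noise-free data with two increasing sources and one decreasing source.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
H_true = np.vstack([t, t ** 2, 1.0 - t])
Z = rng.random((8, 3)) @ H_true
W, H = monotonous_nmf(Z, s=3, c=2)
rel_err = np.linalg.norm(Z - W @ H) / np.linalg.norm(Z)
```

Every iterate of `H` is feasible for (8) by construction, which is the property the QP constraints enforce; replacing the inner loops with an exact QP solve would recover the scheme described in the text.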


3.1 Monotonous Semi-NMF 

This section demonstrates how the idea of monotonous NMF can be extended by relaxing the nonnegativity
constraints on the mixing matrix $\mathbf{W}$ and the data matrix $\mathbf{Z}$. This allows monotonous NMF to be applied to data
matrices containing negative entries. When the nonnegativity constraints are relaxed, the entries of $\mathbf{W}$ can have
either sign. This kind of NMF is referred to as semi-NMF in the literature [3]. However, the
entries of the signal matrix are still nonnegative and monotonous. The optimization problem (7) can be reformulated
without the non-negativity constraints as follows:

$$\min_{\mathbf{W}} \ \|\mathbf{Z} - \mathbf{W}\mathbf{H}\|. \tag{12}$$

For given $\mathbf{H}$, Eq. (12) is a least-squares problem, and hence the analytical update rule can be obtained
as

$$\mathbf{W} = \mathbf{Z}\mathbf{H}^{T}(\mathbf{H}\mathbf{H}^{T})^{+}. \tag{13}$$

In Eq. (13), $\mathbf{H}\mathbf{H}^{T}$ is a positive semidefinite matrix; its pseudo-inverse always exists and coincides with the inverse when $\mathbf{H}$ has full row rank. With this
update rule, a modified version of Algorithm 1 is given in Algorithm 2.


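Data: $\mathbf{Z}$, number of source signals $s$ ($s \ll \min(m, n)$)

Result: $\mathbf{W}$, $\mathbf{H}$

initialization: $\mathbf{W}_{old}$, $\mathbf{H}_{old}$;

while $\|\mathbf{W}_{old} - \mathbf{W}_{new}\| >$ tolerance and $\|\mathbf{H}_{old} - \mathbf{H}_{new}\| >$ tolerance do
    read current;
    $\mathbf{W}_{new} \Leftarrow \mathbf{Z}\mathbf{H}_{old}^{T}(\mathbf{H}_{old}\mathbf{H}_{old}^{T})^{+}$
    $\mathbf{H}_{new} \Leftarrow$ Solution of (11)
end

Return $\mathbf{W} = \mathbf{W}_{new}$, $\mathbf{H} = \mathbf{H}_{new}$

Algorithm 2: Algorithm for Semi-NMF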


This extension is in line with the work done in [3]. Further, Algorithm 2 converges to a solution, as
discussed for Algorithm 1.
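The closed-form update (13) can be checked numerically as in the following sketch. The source shapes, the Gaussian mixing matrix, and the noise level are assumptions chosen only to produce mixed-sign data.

```python
import numpy as np

rng = np.random.default_rng(2)
n, s, m = 8, 3, 50
t = np.linspace(0.0, 1.0, m)
H = np.vstack([t, t ** 2, 1.0 - t])            # nonnegative, monotonous rows
W_true = rng.standard_normal((n, s))           # mixed-sign mixing matrix
Z = W_true @ H + 0.01 * rng.standard_normal((n, m))

# Update rule (13): W = Z H^T (H H^T)^+
W = Z @ H.T @ np.linalg.pinv(H @ H.T)

# At a least-squares minimizer the gradient (W H - Z) H^T vanishes.
grad_norm = np.linalg.norm((W @ H - Z) @ H.T)

# Cross-check against a generic least-squares solve of H^T W^T = Z^T.
W_lstsq = np.linalg.lstsq(H.T, Z.T, rcond=None)[0].T
```

Since no sign constraint is placed on `W`, its entries may be negative, which is what allows `Z` itself to contain negative values.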


4 Illustrative example 


In this section, we demonstrate monotonous NMF on synthetic data involving three source signals. We
have performed simulation studies for two scenarios: (S1) monotonically increasing source signals, and
(S2) mixed monotonous source signals (two increasing signals, one decreasing signal). The noise-free
data matrix is constructed in the following manner. Three monotonous source signals are considered,
as shown in Figures 1 and 2 for Scenarios S1 and S2, respectively. Fifty sample points of the signals are
available in both scenarios. The mixing matrix of size $(8 \times 3)$ is generated from a uniform distribution
on the interval (0,1). The noise-free data matrix $\mathbf{Z}$ of size $(8 \times 50)$ is obtained by multiplying the
mixing matrix and the source signal matrix. To generate the noisy data, 5% uniformly distributed random
noise is added to the noise-free data. All simulations are performed in MATLAB. For Algorithm 1, the
"quadprog" function is used to solve problems (10)-(11).

The following three methods are applied to both scenarios: (i) Monotonous NMF (labeled MNMF),
(ii) NMF using multiplicative rules (labeled NNMF) (MATLAB built-in function), and (iii) a fast NMF
algorithm (labeled NMF) based on active-set methods. All three methods assume three source signals.
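The construction of the synthetic data described in this section can be sketched as follows. The source shapes are hypothetical (the signals used in the paper are only shown in its figures), and the noise model, a uniform perturbation of up to 5% of each entry, is one possible reading of "5% uniformly distributed noise".

```python
import numpy as np

rng = np.random.default_rng(0)
n, s, m = 8, 3, 50                      # sizes stated in the text

# Hypothetical monotonically increasing sources in the spirit of Scenario S1.
t = np.linspace(0.0, 1.0, m)
H_true = np.vstack([t, t ** 2, 1.0 - np.exp(-3.0 * t)])

W_true = rng.uniform(0.0, 1.0, size=(n, s))     # mixing matrix, uniform on (0, 1)
Z_clean = W_true @ H_true                        # noise-free 8 x 50 data matrix

# Assumed noise model: each entry perturbed by up to +/-5% of its value.
noise = rng.uniform(-0.05, 0.05, size=Z_clean.shape) * Z_clean
Z = Z_clean + noise
```

Under this multiplicative noise model `Z` stays nonnegative, so it remains a valid input for NMF; an additive noise model could produce small negative entries and would call for the semi-NMF variant.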
Figure 1: Source signals extracted from the noisy data for Scenario S1


Scenario S1. The normalized source signals obtained by applying the three methods, along with the true
ones, are shown in Figure 1 for Scenario S1. The reconstruction errors ($\|\mathbf{Z} - \mathbf{W}\mathbf{H}\|$) are 0.1197, 0.1564,
and 0.9835 for MNMF, NNMF, and NMF, respectively. This result shows that MNMF performs well
in comparison with NNMF and NMF. It should be noted that NMF estimates a rank-deficient source signal
matrix ($\mathrm{rank}(\mathbf{H}) = 1$ for NMF). In other words, NMF factorizes the data matrix into a single rank-one
factor instead of three rank-one factors, while MNMF and NNMF factorize the data matrix into three
rank-one factors. However, NNMF fails to capture the monotonous behaviour of the source signals.


Scenario S2. The normalized source signals obtained using the three methods, along with the true ones,
are shown in Figure 2 for Scenario S2. The reconstruction errors ($\|\mathbf{Z} - \mathbf{W}\mathbf{H}\|$) are 0.1189, 0.1399,
and 0.5736 for MNMF, NNMF, and NMF, respectively. In this case, MNMF captures both kinds of
monotonous signals better than the other methods. NMF again estimates a rank-deficient
source signal matrix, although the rank of $\mathbf{H}$ has increased by one ($\mathrm{rank}(\mathbf{H}) = 2$ for NMF).
Note that NNMF fails to capture the monotonous behaviour of the source signals in Scenario S2 as well.
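The quantities reported for both scenarios (normalized signals, reconstruction error, and rank of the estimated signal matrix) could be computed along the following lines. The helper names are hypothetical, and normalizing each row of $\mathbf{H}$ to unit maximum is an assumed convention, since the normalization used for the figures is not specified in the text.

```python
import numpy as np

def normalize_rows(W, H):
    """Scale each row of H to unit maximum and rescale W so that W @ H is unchanged."""
    scale = H.max(axis=1)
    scale[scale == 0] = 1.0               # leave all-zero rows untouched
    return W * scale, H / scale[:, None]

def diagnostics(Z, W, H):
    """Frobenius reconstruction error and numerical rank of H."""
    return np.linalg.norm(Z - W @ H), np.linalg.matrix_rank(H)

# Illustrative factors with two increasing rows and one decreasing row.
rng = np.random.default_rng(3)
W = rng.random((8, 3))
H = np.vstack([np.linspace(0, 2, 50),
               np.linspace(0, 5, 50) ** 2,
               np.linspace(3, 0, 50)])
Z = W @ H

W_n, H_n = normalize_rows(W, H)
err, rank_H = diagnostics(Z, W_n, H_n)
```

A rank below the number of sources, as reported for the active-set NMF algorithm, indicates that some estimated rows of $\mathbf{H}$ are zero or linearly dependent.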


5 Conclusions 


This paper has proposed an approach to incorporate the notion of monotonicity in NMF. The new extension
is called monotonous NMF. Monotonous NMF has been applied to recover monotonous source
signals from noisy data. Further, we have extended monotonous NMF to semi-NMF by relaxing the
non-negativity constraints on the mixing matrix. Algorithms for monotonous (semi-)NMF using quadratic
programming have been proposed in the ALS framework. The illustrative examples show that
monotonous NMF performs better than the NMF algorithms from the literature when the
source signals exhibit monotonous behaviour. This indicates the importance of the monotonicity constraint
in NMF.